Binary Neural Networks Algorithms, Architectures, and Applications (Baochang Zhang, Sheng Xu, Mingbao Lin etc.)


where $L^{GT}$ is the detection loss derived from the ground-truth label and $L^{Lim}$ is the fine-grained feature limitation defined in [235]. The LWS-Det process is outlined in Algorithm 13.

6.4.5 Ablation Study

Effectiveness of DBS. We first compare our DBS method with three other methods for producing binarized weights: Random Search [277], Sign [99], and RSign [158]. As shown in Table 6.4, we evaluate the effectiveness of DBS on two detectors: one-stage SSD and two-stage Faster-RCNN. On the Faster-RCNN detector, DBS improves the mAP by 8.1%, 4.3%, and 9.1% compared to Sign, RSign, and Random Search, respectively, under the same student-teacher framework. On the SSD detector, DBS improves the mAP by 5.5%, 3.3%, and 11.3% compared to Sign, RSign, and Random Search, respectively, which is a substantial gain for the object detection task.
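The gains quoted above follow directly from the mAP values in Table 6.4. As a quick arithmetic check, the sketch below recomputes them; the dictionary layout and the helper name `gains_over` are illustrative assumptions and are not part of the original work.

```python
# mAP (%) values copied from Table 6.4 (the real-valued rows are omitted).
TABLE_6_4 = {
    "Faster-RCNN": {"Sign": 65.1, "RSign": 68.9, "Random Search": 64.1, "DBS": 73.2},
    "SSD": {"Sign": 65.9, "RSign": 68.1, "Random Search": 60.1, "DBS": 71.4},
}


def gains_over(framework, method="DBS"):
    """Return the mAP gain of `method` over each other binarization method.

    `gains_over` is a hypothetical helper used only for this check.
    """
    scores = TABLE_6_4[framework]
    return {
        other: round(scores[method] - value, 1)  # round away float noise
        for other, value in scores.items()
        if other != method
    }


if __name__ == "__main__":
    for framework in TABLE_6_4:
        print(framework, gains_over(framework))
```

Keeping the table values in one place makes it easy to verify that each difference reported in the text matches the corresponding pair of table entries.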

Convergence analysis. We evaluate the convergence of the detection loss during training, compared against the other binarization methods, on two detectors: Faster-RCNN with a ResNet-18 backbone and SSD with a VGG-16 backbone. As plotted in Fig. 6.12, the LWS-Det training curve based on random search oscillates strongly, which we suspect is caused by the poorly optimized angular error of the randomly searched binary weights. Additionally, our DBS achieves the lowest loss during training compared to Sign and RSign. This also confirms that our DBS method can binarize the weights with minimum angular error, which explains its best performance in Table 6.4.
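To make the notion of angular error concrete, the following sketch computes the angle between a real-valued weight vector and a binary counterpart. It assumes the angular error is the angle between the two vectors, which may differ in detail from the definition used by LWS-Det. The helper names and the toy weights are hypothetical, and the code does not implement DBS; plain sign binarization and a randomly drawn binary vector are used only as stand-ins.

```python
import math
import random


def sign_binarize(weights):
    """Map each weight to +1 or -1 (zero is mapped to +1)."""
    return [1.0 if w >= 0 else -1.0 for w in weights]


def angular_error(weights, binary):
    """Angle in radians between the real-valued and binary weight vectors.

    Assumed definition: arccos of the cosine similarity of the two vectors.
    """
    dot = sum(w * b for w, b in zip(weights, binary))
    norm_w = math.sqrt(sum(w * w for w in weights))
    norm_b = math.sqrt(sum(b * b for b in binary))
    # Clamp to [-1, 1] so rounding noise cannot push acos out of its domain.
    cosine = max(-1.0, min(1.0, dot / (norm_w * norm_b)))
    return math.acos(cosine)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the toy example is repeatable
    w = [rng.gauss(0.0, 1.0) for _ in range(256)]
    random_binary = [rng.choice((-1.0, 1.0)) for _ in w]
    print("sign binarization:", angular_error(w, sign_binarize(w)))
    print("random binary    :", angular_error(w, random_binary))
```

For a toy vector like this one, a randomly drawn binary vector sits at a larger angle from the real-valued weights than the sign-binarized one, which illustrates how poorly chosen binary weights go together with a large angular error.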

6.5 IDa-Det: An Information Discrepancy-Aware Distillation for 1-bit Detectors

Recent work [264] employs fine-grained feature imitation (FGFI) [235] to enhance the performance of 1-bit detectors. However, it neglects the intrinsic information discrepancy between 1-bit detectors and real-valued detectors. Fig. 6.13 shows, from top to bottom, the saliency maps of real-valued Faster-RCNN with the ResNet-101 backbone (often used as the teacher network), real-valued Faster-RCNN with the ResNet-18 backbone, and 1-bit Faster-RCNN with the ResNet-18 backbone (often used as the student network). They show

TABLE 6.4
Ablation study: performance of different binarization methods compared with DBS.

Framework     Backbone    Binarization Method   mAP (%)
Faster-RCNN   ResNet-18   Sign                  65.1
                          RSign                 68.9
                          Random Search         64.1
                          DBS                   73.2
                          Real-valued           76.4
SSD           VGG-16      Sign                  65.9
                          RSign                 68.1
                          Random Search         60.1
                          DBS                   71.4
                          Real-valued           74.3